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A promising method for calculating free energy differences AF is to generate non-equilibrium 
data via "fast-growth" simulations or experiments - and then use Jarzynski's equality. However, a 
£Nj , difficulty with using Jarzynski's equality is that AF estimates converge very slowly and unreliably 

due to the nonlinear nature of the calculation - thus requiring large, costly data sets. Here, we 
present new analyses of non-equilibrium data from various simulated molecular systems exploiting 
statistical properties of Jarzynski's equality. Using a fully automated procedure, with no user-input 
parameters, our results suggest that good estimates of AF can be obtained using 6-15 fold less 
data than was previously possible. Systematizing and extending previous work Q], the new results 
exploit the systematic behavior of bias due to finite sample size. A key innovation is better use of 
the more statistically reliable information available from the raw data. 
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I. INTRODUCTION 



The calculation of free energy differences, AF plays an essential role in many fields of physics, chemistry and biology 
[EIEIEI^IMIEI^IEI^ EO, llll E3, E3, EH EH E3| Examples include determination of the solubility of small 
molecules, and binding affinities of ligands to proteins. Rapid and reliable estimates of AF would be particularly 
valuable to structure-based drug design, where current approaches to virtual screening of candidate compounds rely 



primarily on ad-hoc methods 0> 13 • F ree energy estimates are also critical for protein engineering 0, [21, 

The focus of this report is non-equilibrium "fast-growth" free energy calculations . These methods hold promise 
- yet to be fully realized - for very rapid estimation of AF. The central idea behind the non-equilibrium methods 
O • is to calculate the irreversible work during a very rapid (thus non-equilibrium) switch between the two systems or 
states of interest. Multiple switches are done, and the resulting set of work values can be used to estimate AF using 
Jarzynski's equality (detailed in Sec.[H]) 22] . 

Somewhat surprisingly, non-equilibrium AF calculations are critical for analyzing single-molecule pulling experi- 
ments fO. Il5|. In essence, these experiments generate non-equilibrium work values, as pointed out by Hummer and 
Szabo, so the only way to estimate the free energy profile is to use Jarzynski's equality yj,[23|- The methods that we 
develop in this report should be equally useful for analyzing such experiments. 

It has been accepted for some time that there are three sources of error [24( for non-equilibrium AF calculation: 
(i) inaccuracy of the force field [2II, (ii) inade quat e sampling of the configurational space j2(| I27I I2H , 12^. l3Cj . and 
1 (iii) bias due to finite sample size 111 

Him SlUlii. Error 

in free energy calculations have been of long-standing 

interest, e.g. Refs. [MElMlIllIII 

The present study addresses only source (iii) , and attempts to determine the most efficient use of fast-growth work 
values. In other words, given a (finite) set of work values generated by simulation or experiment, what is the best 
estimate for AF? We do not here attempt to prescribe the best method for generating non-equilibrium work values. 

We proceed by first introducing two new block averaging techniques, based on the original proposal by Wood et 
al. H3- Block averaging provides well-behaved, but biased, AF estimates. We then discuss two distinct schemes for 
extrapolating to the "infinite data limit" . Our work systematizes and extends previous work by Zuckerman and Woolf 
0, who originally proposed the use of block averages for extrapolation. 

Methods to lessen the effect of bias due to finite sample size have been proposed for the case when switching 
between systems is performed in both directions 

nun 

]44j. and for the simplified case in which the non-equilibrium 
work values follow a quasi-Gaussian distribution [16l |45j . Hummer also considered errors in non-equilibrium AF 



calculations [46j. To our knowledge, however, other workers have not addresses uni-directional switching in highly 
non-Gaussian systems. 

The techniques outlined in the following sections offer rapid estimates for AF for the systems we studied - namely, 
the chemical potential for a Lennard- Jones fluid, "growing" a chloride ion in water, methanol — > ethane in water, 
and palmitic — > stearic acid in water. Work values for these systems follow highly non-Guassian distributions. We 
compare our extrapolated results to AF obtained by using Jarzynski's equality, finding a 6-15 fold decrease in the 
the number of work values needed to estimate AF for the test systems considered here. 
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II. FAST-GROWTH 

Fast-growth techniques have been described in detail elsewhere p], |2^, ^(J , so we will simply outline the method. 
Consider two systems defined by potential energy functions Uq and U\. To calculate the free energy difference AF 
between these two systems, one must simply "switch" the system from Uq to U%. This is readily accomplished by 
defining a switching parameter A such that 

C/ A (x) = (7 (x) + a[[/i(x) - f/ (x)] , (1) 

where x is a set of configurational coordinates, and U\(x) describes the "hybrid" potential energy function for all 
values of A from to 1. We note that nonlinear scaling with A is also possible 0, EJ El El E3 resulting in hybrid 
potentials differing from Eq. (JJJ. Our approach here also applies, in principle, to other such choices. Essentially, 
the idea behind fast-growth methods is to perform rapid switches from A = — > 1, where each switch is generated 
starting from coordinates drawn from the equilibrium ensemble for A = 0. During each switch, the irreversible work 
is accumulated, generating a single work value. Multiple switches are done to generate a distribution of these work 
values p(W). 



A. Simple Estimators 



It has been appreciated for some time that the average work obtained over many such switches provides a rigorous 
upper bound for the free energy difference, 

AF < (W) , (2) 

where the (•■■)() represents an average over many switches starting from the equilibrium ensemble for A = and ending 
at A = 1. Equality occurs only in the limit of infinitely slow switches. Further, if the distribution of work values 
p{W) is Gaussian (this occurs if the system remains in equilibrium during the switch, but also may occur in certain 
far from equilibrium situations) , then the high temperature expansion of Zwanzig |48| gives 

AF=(W\-^a w , (3) 

where aw is the standard deviation of p(W), and (3 = 1/fcgT where T is the temperature of the system and Ub is 
the Boltzmann constant. 

However, for the fast-growth work values under consideration here, the distribution of work values can be very 
broad and non-Gaussian. Thus Eqs. J5J and |JS} will not provide reasonable estimates of AF; see Fig.^ It is possible 
to use higher order moments to estimate AF (see for example Refs. [la IS EE E3 ) • These estimators are most useful 
in the near-equilibrium regime. 



B. Jarzynski Equality 



Due to recent work by Jarzynski |22j, |5(J, |51| , it is possible to estimate AF using these fast-growth W values via 
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This remarkable relationship is valid for arbitrary switching speed, implying that one can perform switches as rapidly 
as desired and still obtain valid estimates of AF. The Jarzynski equality thus provides an estimate for AF for a set 
of N work values given by 
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The AF estimates given by Eq. JSJ), however, are very sensitive to the distribution of work values p(W) [H l3ll l32]| . 
If the width of the work distribution large, i.e. aw 3> kgT (this implies a very rapid switch and/or a complex system), 
then often thousands, or even tens of thousands of work values are needed to reliably estimate AF. An example of 
this can be seen in Fig. where a histogram of work values is shown for PAL2STE (described in Sec. lIV|) . The value 
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FIG. 1: Distribution of work values for PAL2STE test system (palmitic — > stearic acid mutation in water, described in Sec. 
II VI . Also included in this plot are the estimators given by Eqs. and © shown by the blue dot-dash and green dashed 
line respectively. The solid black line shows the AF estimate obtained by using Jarzynski's equality in Eq. © for all available 
data. 



of AFj arz given by the Jarzynski equality, as well as estimators (W) and (W) Q — ^(3a Wl are shown on this plot. 
This graphically demonstrates why Eqs. (j2J and (0 are often poor estimates of the free energy for fast-growth work 
values. 

If the switch is performed instantaneously, then Eq. J5J becomes 

e -/3AF = ^ e -/3[i7 1 (x)-C7o(x)]^ ^ ^ 

often called single-stage free energy perturbation [48l l52|. In this limit, the system is not allowed to relax at any 
intermediate values of A. Instead Ui(pc) is simply evaluated at values of x drawn from the equilibrium ensemble for 
A = 0. The advantage of this method is that data can be generated very quickly. However, in practice, unless there is 
sufficient overlap between the states described by U and Ui, the estimate of AF will be biased, often by many k B T 
|53l l54j . The problem of attaining overlap of states can be improved by drawing from the equilibrium ensemble for 
an unphysical "soft-core" state (such as for A = 0.5) [2§Hri| . 

Recent work by Hendrix and Jarzynski showed that essentially the only determining factor in the accurate 
calculation of AF was physical CPU time spent during the calculation. So, doing many rapid switches has no 
advantage over doing fewer slower switches. This conclusion is based upon using Eq. JSJ for all AF estimates. 

This manuscript describes methods that exploit statistical properties of Jarzynski's equality, allowing us to do use 
work values from very rapid switches and obtain AF estimates with 6-15 fold less work values than using Eq. JSJ. 



III. OTHER METHODS 



To calculate reliable AF estimates using less work values, one can generate a narrower p(W), i.e. perform the 
switching process more slowly. However, slower switching speed means that more computational time will be spent 
to generate each work value - offsetting some of the advantage gained by doing rapid switches. 

If the switch is performed so slowly that the system remains near equilibrium during the switch, then the width of 
the distribution will be very small (aw < ksT), and thus only a few work values are required for accurate estimation 
of AF This slow-growth method is, in principle, equivalent to thermodynamic integration j5|| where AF is 
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calculated by allowing the system to reach equilibrium for each value of A. Then AF is found using 




A 



(7) 



Thermodynamic integration and slow-growth can provide very accurate AF calculations, but are also computationally 
expensive [E E3, - 

As previously mentioned, the equilibrium ensemble, when using the Eq. ©. is generated for A = 0. Then p(W) is 
generated by doing switches from A = — > 1 (forward switches) with configurations drawn from the A = ensemble. It 
is also possible to generate another equilibrium ensemble for A — 1 and then perform reverse switches from A = 1 — ► 0. 
It has been shown that, if one combines the use of the forward and reverse work values, convergence is much more 
rapid then doing just forward switches |42l l43l E3. IHflj. It has been recently demonstrated that most efficient use of 
forward and reverse work values is for Bennett's method 0,E2E3l- 

There is, however, a distinct advantage to using Jarzynski's estimates with only forward switches, when one considers 
the eventual goal of predicting relative binding affinities for application in drug design. In this situation, if using 
Jarzynski estimates, one need only generate a single high-quality equilibrium ensemble for a particular ligand-receptor 
or reference complex. Then one can determine relative binding affinities for other ligands without generating another 
equilibrium ensemble - a significant decrease in computational expense. 



To show the generality of the methods proposed in this study we consider four test systems with varying molecular 
complexity: a chemical potential calculation for a Lennard- Jones fluid, "growing" a chloride ion in water, methanol 
— ► ethane in water, and stearic — > palmitic acid in water. 

The last two systems are alchemical mutations of fully solvated molecules (see Refs. [l|,|5|| for simulation details), 
and the first system is a chemical potential calculation done by the par ticle insertion method (see Ref. for details). 
All three of these data sets were generated previous to this study • 

The growing chloride system was studied u sing TINKER version 4.1 [§l]], with the simulation conditions chosen to 
closely match those of Lybrand et al. in Ref. [57] . Stochastic dynamics simulations were carried out in the canonical 
ensemble (constant N, V, T) in a cubic box of edge length 18.6216 A. The temperature was held at 300 K by a Berendsen 
thermostat with a time constant of 0.1 [f^. The chloride ion was modeled with Lennard- Jones parameters a = 4.4463 
A and e = 0.1070 kcal/mol, and was solvated by 214 SPC water molecules. Ewald summation approximated charge 
interactions and RATTLE was used to hold the water molecules rigid 63]. For this test system, the Lennard- Jones 
"size" was increased by 1.0 A, from a = 4.4463 A at A = to a — 5.4463 A at A = 1. 

To obtain fast-growth work values, a time step of 1.0 fs was used. The system was equilibrated for at least 10 ps, 
after which starting configurations for each fast-growth trajectory were generated every 100 time steps. 

Below we list the notation used to refer to each data set. Also included are statistical features of the data sets - 
the total number of work values (N to t), the mean work (/ Wj) and the standard deviation (aw)- These data sets are 
all considered difficult to use for AF calculations owing to the facts that aw 3> ksT and (W) — AF > lOfc^T; see 
Eqs. © and © and Fig.Q] 

L J - Chemical potential calculation for a Lennard- Jones fluid in 1 A-step |50j . This corresponds to instantaneous 
switching or free energy perturbation, as described in sec.|n] Ntot — 100, 000, (W) — 305.1 fc^T and aw = 83.5 
ksT. Using all work values, Eq. © gives a best estimate AFbest = 0.7 fc^T. 

GROWCL - Grow chloride by 1.0 A in 10 A-steps with 1 relaxation step at each value of A. Ntot — 40,000, 
(W) — 40.1 kcal/mol and aw = 8.6 kcal/mol. Using all work values, Eq. © gives AFbest = 18.4 kcal/mol. 

METH2ETH - Methanol to ethane mutation data using 200 A-steps with 1 dynamic relaxation step at each 
value of A Q. N to t = 9,600, (W) = 37.0 kcal/mol and a w = 12.3 kcal/mol. Using all work values, Eq. © 
gives AFb ea t = 7.4 kcal/mol. 

PAL2STE - Palmitic to Stearic acid mutation data using 55 A-steps with 10 relaxation steps at each value of 
A HH. Ntot = 20,000, (W) = 28.6 kcal/mol and a w = 7.5 kcal/mol. Using all work values, Eq. © gives 
AFbest = 15.2 kcal/mol. 

Since the goal is to determine the best analysis for a given set of work values, we assume that the true AF is given 
by Eq. © using all available work values (i.e. AFbest above). Determining whether the distribution of work values 
p(W) used in this paper are complete and representative is beyond the scope of this report. 



IV. 



TEST SYSTEMS 
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FIG. 2: The running Jarzynski estimate, given by Eq. ©, as a function of the number of work values used in the estimate, 
N is shown as a solid blue line. The dashed red line shows the sub-sampled block averaged free energy estimate given by Eq. 

plotted as a function of the number of work values in each block, n. Data used for these estimate were obtained from the 
LJ test system (chemical potential for a Lennard- Jones fluid). The Jarzynski estimate displays erratic convergence behavior, 
while the block averaged free energy estimate displays a smooth monotonically decreasing estimate 



V. 



BLOCK AVERAGING 



The motivation for using block averages can be seen in Fig. [21 (see also Refs. @:|41|). The solid blue line is a running 
estimate for AF obtained by using Eq. (JSJ) and the dashed red line is obtained by block averaging. Both curves are 
obtained using the LJ test system. The running Jarzynski estimate exhibits very poor convergence behavior, making 
it very difficult to establish when a reliable estimate of AF has been obtained. The block averaged free energy, 
however, displays a smooth monotonically decreasing AF estimate, which approaches the true AF. 

Each block averaged free energy (AF„) data point was obtained from a set of N work values (W\, W2, Wn) using 
the following scheme |4l|: 

1. Draw n work values at random from the set, generating a subset iW\, W2, W n ). This is now the j th block of 
work values. 

2. Use Jarzynski's equality, Eq. JSJ to obtain a free energy estimate Fj for this block 



In 



\i£ block j / 



(8) 



3. Repeat steps 1 and 2 until you have m blocks, each containing n values. Now the average (AF n ) and standard 
deviation (er„) can be calculated using 
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This process is carried out for every possible value of n (i.e. n = 1,2, 3, N). 

In previous work m = N/n was chosen, i.e. m is the number of blocks of size n from a data set of size N. 

The weakness of this choice is that a reshuffling of the data set gives a new (generally different) set of AF„ values. 
To avoid this weakness we choose m large enough that the resulting AF„ values do not depend upon the value of m. 
This is typically accomplished with m ~ 100 x N/m. 

Since there are two distinct ways of randomly drawing from a data set (i.e. implementing the first step above), we 
introduce two new block averaging schemes. The first is to draw work values from (Wi, W2, Wjv) at random with 
replacement - i.e. it is possible to draw a particular work value more than once. We call this a bootstrapped AF n 
|6J|. The second is to draw from (W%, W2, Wn) at random without replacement. We call this a sub-sampled AF„ 

The difference between the bootstrapped and sub-sampled methods can be illustrated by considering a data set 
of N work values where N — 1 values are large and one value is very small. Due to the highly nonlinear nature of 
the Jarzynski equality, the single small work value will dominate Eq. ©. Suppose one calculates AF n for n = N 
using both of these methods. The sub-sampled method will only have one AFn estimate since reshuffling the work 
values has no effect when n = N. However, the bootstrapped method calculates a AFn value that is larger than the 
sub-sampled AFn due to the fact that it will draw the small work value only a fraction of the time. A generalization 
of this argument shows that the bootstrapped AF n will exceed the sub-sampled AF n for every value of n. 



VI. EXTRAPOLATION METHODS 



Now that a smooth function has been obtained in the block averaged free energy AF n , shown in Fig.[5J extrapolation 
to the infinite data limit becomes feasible, as originally suggested by Zuckerman and Woolf [lj. The basic idea is 
to plot AF n as a function of some variable and then extrapolate to the infinite data limit (n — > 00). It is useful to 
plot AF n as a function of x = 1/« T as shown in Fig. [3] [jj . The plot was generated by choosing 100 work values 
at random from the PAL2STE data set. AF n was then computed for this subset of 100 work values following the 
steps outlined in Sec. [V] In this plot the bootstrapped AF n is shown, and the best estimate AFb est is included as the 
solid black line. A value of r = 0.22 was chosen to minimize the slope of AF n (x) as discussed below. The errorbars 
show the statistical uncertainty of the AF n given by the standard error associated with a n in Eq. (|10fl . The smallest 
uncertainty occurs for x = n = 1 due to the fact that AF n {x = 1) is simply the average work. 

It is useful to plot AF n as a function of x — l/n T as in Fig. [5] (rather than n) because the infinite data limit 
(n — > 00) now corresponds to x — 0- In addition, this simple form gives a bounded interval (x — (0, 1]), rather than 
an infinite one (such as AF n as a function of n) . This form allows us to develop two simple extrapolation schemes as 
explained in the following sections. 



A. Linear Extrapolation 

It is known that the block averaged free energy AF n in Eq. guarantees monotonic behavior 0,0, El- Thus, one 
can hope to obtain a reasonable estimate of AF by simply continuing the curve in Fig. [3] with a straight line. Such a 
linear extrapolation guarantees that our extrapolated results will not exceed AF n for n = N — neccessary since AF n 
is a rigorous upper bound for the true AF [24| . 

We test this extrapolation method using the test systems described in IIVI This fully automated process contains 
the following steps: (i) Draw a subset containing N work values (Wi, W2, Wn) at random from the data set. (ii) 
Plot the bootstrapped AF„ as a function of x = I/ 71 - 1 "; vary r, then choose the value of r that minimizes the slope of 
the tail (i.e. small x) of AF n . (If one has enough data to get the correct AF then, for the right value of r, the slope 
will be nearly zero.) (iii) Extrapolate AF„ to x = using a straight line. The intercept (x = 0) is our extrapolated 
free energy AFu n . (iv) Using these same N work values, estimate the free energy AFj arz with Eq. (JSJl. This process 
is repeated 500 times to obtain the average and standard deviation of our AFu n and AFj arz . 

A simple extension of the linear method shown here, is to fit AF„ to a nonlinear function, such as quadratic in Xi as 
in previous work by Zuckerman and Woolf lj. These nonlinear extrapolation methods offer little, if any, improvement 
in the average AF extrapolations. And, due to the inherent instability of high order fits, the standard deviations for 
the extrapolated results are much larger than those obtained for linear extrapolation. 
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FIG. 3: Bootstrapped block averaged free energy (AF n ), given by Eq. © as a function of \ are shown as red squares. The solid 
black line represents the best estimate AFt, est . The value of r = 0.22 is chosen to minimize the slope of AF n (x) as described 
in Sec. IvTaI This plot was generated using 100 work values chosen at random from the PAL2STE test system (palmitic — > 
stearic acid mutation). In this plot, extrapolating to the infinite data limit cooresponds to continuing the AF n curve to \ = 
to obtain the intercept. This plot also demonstrates that the large y (small n) data are more reliable as shown by the errorbars 
which represent the standard error of AF n . 



B. Reverse Cumulative Integral Extrapolation 

As previously metioned (see Fig. the most precise AF„ values occur for larger \ ~ 1 (i- e - smaller n), yet the 
previous linear extrapolation scheme relies exclusively on small \ values. Thus, in an effort to use the more precise 
large- x data to extrapolate AF, we now formulate an integration scheme which explicity includes all values of x- 

Consider treating AF n in Fig. [31 as a smooth function AF n (x), from x — to 1. We are free to consider the area 
under this function, re-written using integration by parts, 



1 d X (l-x) d -^ + AF n (x = 0). 



/o Jo d X 

But AF n (x = 0) is just the extrapolated free energy estimate AF rci , so 



(x) - (i - x) 



dAF n ( X ) 
dX 



Now the reverse cumulative integral function can be defined by 

RCI( X ) -- 



(11) 



(12) 



(13) 



where it should be noted that we accumulate in the reverse direction from x' = 1> where the data is most precise, to 
X 1 = X, i- e - from right to left in Figs. 01 and 0] 

A sample plot of the reverse cumulative integral is shown in Fig. 01 This plot was generated using two subsets 
(represented by open and closed symbols) of 100 work values drawn at random from the PAL2STE data set. The 
solid black line shows the best estimate AF{, est , the blue squares are the sub-sampled AF n and the red circles are 
RCI(x)- For each of the two subsets, the value of r was chosen to minimize the slope of the tail of RCI(x), as discussed 
below. The subset represented by the open symbols slightly overestimates Ai^est, while the subset represented by 
the closed symbols slightly underestimates AFb est . 
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FIG. 4: Examples of the reverse cumulative integral, RCI(x) are shown for a two subsets of 100 work values drawn at random 
from the PAL2STE data set (palmitic to stearic acid mutation). The first subset is represented by open symbols and the 
second by closed symbols. The solid black line shows the best estimate AFt, est , the blue squares are the sub-sampled AF„ and 
the red circles are the RCI(x)- The strength of using RCI(x) for extrapolation is its explicit use of all the AF n values. For 
each subset, the value of r was chosen to minimize the slope of the small- x tail of RCI(x), as described in Sec. I VI 51 In this 
example, the subset represented by the open symbols slightly overestimates AFbest, while the subset represented by the closed 
symbols slightly underestimates AFbest- 



To obtain an extrapolated value for AF, consider the case where one has more than enough data to obtain AF 
exactly. In this situation, if t is chosen carefully, RCI(\) will have nearly zero slope for small x, since accumulating 
more x values will not change the estimate. Thus, one can hope to extrapolate AF by simply finding a value of r 
where the slope dRCI(x)/dx is the smallest for small x, then the extrapolated free energy AF rci will be the value of 
RCI(x) for the smallest value of x available, Xmin- 

Our fully automated test of this new extrapolation method is very similar to that described in the previous section, 
with only minor differences: (i) the sub-sampled AF n is used, (ii) the value of r is chosen to minimize the slope of the 
tail (small x) of RCI(x) - see Fig. 0J and (iii) once the value of r is determined, the free energy is estimated to by 
AF rc i — RCI(xmin)- Comparison is made with the Jarzynksi estimate AFj ar z using the same procedure as in the 
last section. 



VII. RESULTS 



The initial results of this study are very positive as shown by the rapid convergence of our extrapolated AF estimates 
(Fig. 01. Compared to AFj arz , estimates of AF can be made with 6-15 fold less work values, i.e. less computational 
expense. 

Fig. demonstrates how the linear and reverse cumulative extrapolation (RCI) methods described above compare 
to using the Jarzynski estimate of Eq. (JjjJ, for each of the four test systems. For all of the plots shown, the solid 
black horizontal line corresponds to the Jarzynski estimate using all available work values and thus represents the best 
estimate AF, est from Sec. IIVI The red squares are averages of AFj arz using Eq. (J5J , the blue triangles are averages 
of AFu n from Sec. IVI Al and the green circles are averages of Af ra from Sec. IVI Bl The inset for each plot shows the 
standard deviation of the AF estimates (cta_f)- Averages and stardard deviations were obtained by performing 500 
independent trials for each estimate (AFj arz , AFu n , AF rc j) for every value of N. Thus o~af indicates the expected 
statistical uncertainty - that is the range of values one would expect if the calculation was performed de novo. 

A glance at Fig. [5] reveals that the linearly extrapolated AFu n estimates converge to the best estimate AFt, est more 
quickly than the Jarzynski estimate AFj arz . The larger uncertainty of the linearly extrapolated estimates is, at least 
partially, explained by the fact that it relies on the less certain AF„ values as explained in Sec. IVI Al Also, the linear 
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FIG. 5: A comparison between AF estimates for linear extrapolation, reverse cumulative integral (RCI) extrapolation, and 
the Jarzynski equality for all of the test systems. For each of the plots the solid horizontal black line indicates the best estimate 
AFbest given in Sec. IIVI the red squares are averages of Jarzynski estimates given by Eq. 0, the blue triangles are averages 
of linearly extrapolated estimates from Sec. IVI Al and the green circles are averages of RCI extrapolated estimates from Sec. 
IVIBI The inset in each plot shows the standard deviation for each of the estimates. Averages and stardard deviations were 
obtained by performing 500 independent trials for each estimate for each value of N. 



estimates tend to "overshoot" AFb es t- 

Many of the disadvantages of the linearly extrapolated estimates are somewhat overcome by RCI extrapolation. 
Since RCI extrapolation relies heavily on the more precise values of AF n , the uncertainty is generally smaller than 
that of the linear estimates. Remarkably, for the LJ system, the RCI extrapolated uncertainty is smaller than the 
Jarzysnki estimate uncertainty for N > 40. Also, RCI extrapolated estimates do not tend to appreciably overshoot 

To obtain a quantitative comparison between RCI extrapolated estimates AF rc i, and Jarzynski estimates AFj arz , 
we ask the following question: how many work values are necessary to obtain a AF estimate that falls within 1.0 
kcal/mol of the best estimate AF{, est 7 Table [|] summarizes the results of this comparison. The RCI estimates offer 
a significant improvement over the Jarzynski estimates in all of the test systems, with a 6-15 fold decrease in the 
number of work values needed to estimate AFb est within 1.0 kcal/mol. 

Due to the fact that the linearly extrapolated estimates tend to overshoot AFb est , often by many kcal/mol, a 
quantitative comparison between AFu n and AFj arz would be difficult and unreliable. Thus, we do not attempt to 
make such a comparison here. 
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TABLE I: A quantitative comparison between the reverse cumulative integral estimates (AF rc i) and the Jarzysnki estimate 
(AFjarz) shown in Fig. The first column shows the test system used in the comparison. The second and third columns are 
the number of work values needed to obtain an estimate that falls within 1.0 kcal/mol of AFbest for the reverse cumulative 
integral (N rC i) and Jarzynksi (Nj arz ) estimates. The rightmost column is the ratio of these two values, i.e. the approximate 
improvement of the reverse cumulative integral estimate over the Jarzysnki estimate. 



System 


N rci 


Nj ar z 


Improvement 


LJ 


800 


6000 


7.5 


GROWCL 


200 


3000 


15 


METH2ETH 


400 


2500 


6.25 


PAL2STE 


40 


500 


12.5 



VIII. CONCLUSION 

We have described two methods that improve standard non-equilibrium estimates of free energy differences, AF: 
linear extrapolation and reverse cumulative integral (RCI) extrapolation. Four test systems were used in this study: 
chemical potential calculation for a Lennard-Jones fluid, growing a chloride ion in water, methanol — > ethane mutation 
in water, and palmitic — > stearic acid mutation in water. Both of the methods rely on block averaged free energies 
AF n , which are extrapolated to the infinite data limit, and offer more rapid estimates of AF than using the Jarzynski 
equality alone, for the test systems considered here. 

Previous work by Zuckerman and Woolf |lj used a quadratic extrapolation method to estimate AF. The present 
study offers several improvements: (i) the accuracy and uncertainty of the extrapolated estimates are reduced due 
to improved, fully automated methods; (ii) two new methods for calculating the block averaged free energies, AF n 
are described; (iii) a key innovation is offered in RCI extrapolation in its use of the more reliable AF n data; (iv) 
a systematic quantitative comparison is done between the RCI and Jarzynksi AF estimates, showing a 6-15 fold 
decrease in the number of work values needed for the RCI estimates; (v) we have tested our extrapolation methods 
on four systems of varying molecular complexity. 

For the first time, bootstrapped and sub-sampled block averaged free energies are introduced. These AF n offer 
very smooth convergence properties allowing statistically reliable extrapolation. The ability to generate smooth AF n 
data is critical to the extrapolation methods described here. 

A quantitative comparison between the RCI extrapolated AF estimates and those using Jarzynski's equality show 
a marked decrease in the number of work values needed to estimate AF when using the RCI estimates. RCI extrap- 
olation can obtain AF estimates using 6-15 fold less data than the Jarzysnki estimates. The linear extrapolation 
estimates tend to overshoot the best estimate AF and has a larger uncertainty than RCI extrapolation. However, the 
partial success of the "simple-minded" linear extrapolation does illustrate the power of the underlying idea: systematic 
behavior in bias can be exploited. 

Other similar extrapolation methods could be developed that may offer improvement over those presented here. 
Such methods are currenty under investigation by the authors. Future work by the authors will use extrapolation 
methods, such as those described here, to generate AF estimates for large molecular systems such as relative protein- 
ligand binding affinities. 
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